PREFACE

Part  of  the  Project  RAND  research  program  consists 
of  basic  supporting  studies  in  mathematics.  In  this 
Memorandum  the  author  presents  an  alternative  mathematical 
approach  to  a  type  of  general  problem  arising  in  the 
study  of  control  processes. 


SUMMARY 


Control  problems  associated  with  the  linear  vector 
system 

$\frac{dx}{dt} = Ax + f(t), \quad x(0) = c,$

with f(t), the control vector, subject to nonclassical
constraints, have received a great deal of attention in
recent years. In particular, let us cite the "bang-bang"
process where each component of f(t) is allowed to
assume only two distinct values.

In  this  paper,  we  present  an  alternative  formulation 
in dynamic programming terms which is independent of the
dimension  of  x,  the  state  vector.  It  is  based  upon  an 
extension  of  the  concept  of  state  variable  and  has 
application  to  a  number  of  systems  with  switching 
characteristics.  In  its  simplest  form,  the  approach  was 
used  in  the  study  of  adaptive  control  processes. 
DYNAMIC  PROGRAMMING,  GENERALIZED  STATES,
AND  SWITCHING  SYSTEMS 


1.  INTRODUCTION

Control  problems  associated  with  the  linear  vector 
system 

(1.1)  $\frac{dx}{dt} = Ax + f(t), \quad x(0) = c,$

with  f(t),  the  control  vector,  subject  to  nonclassical 
constraints,  have  received  a  great  deal  of  attention  in 
recent  years.  In  particular,  let  us  cite  the  "bang-bang" 
process  where  each  component  of  f(t)  is  allowed  to 
assume  only  two  distinct  values;  see  [1,2,3].  Other 
references  will  be  found  in  these  sources. 

Although  there  are  many  approaches  with  varying 
degrees  of  effectiveness  now  available,  it  cannot  be  said 
that  the  problem  of  numerical  solution  of  problems  of 
this  genre  has  been  completely  resolved.  The  situation 
is,  of  course,  even  more  unsatisfactory  when  the  basic 
equation  describing  the  system  is  nonlinear.  In  this 
paper  we  wish  to  make  a  contribution  to  the  general 
problem  by  considering  the  case  where  f(t)  has  only  one 
nonzero  component.  We  may  consider  that  this  type  of 
problem  arises  in  the  case  where  the  system  is  described 
by  a  scalar  equation  of  the  form 
(1.2)


uW  .  g(,(f^l) 


where $v(t) = \pm 1$. It should also be pointed out that
this  particular  control  process  can  be  used  as  the  basis 
of  a  method  of  successive  approximations.  We  shall 
return  to  this  point  below. 

Control  processes  of  general  type,  with  or  without 
constraints,  can  readily  be  formulated  in  dynamic 
programming terms; see [4], [5]. Numerical application
of this formulation is restricted at the present time by the
limited rapid-access storage capacities of current
digital computers.

In what follows, we present an alternative formulation
in dynamic programming terms which is independent
of  the  dimension  of  x,  the  state  vector.  It  is  based 
upon  an  extension  of  the  concept  of  state  variable  and 
has  application  to  a  number  of  systems  with  switching 
characteristics.  In  its  simplest  form,  the  approach  was 
used  in  the  study  of  adaptive  control  processes;  see 
[5],  [6]. 

2.  EXTENDED  STATE

In  the  classical  formulation  of  descriptive  and 
control  processes,  the  state  of  the  system  is  defined  to 
be  the  minimal  set  of  data  required  to  determine  the 
future  behavior  of  the  system;  see  [5],  [7].  Let  us  now 
expand this concept in the following manner. The
extended  state  of  a  system  is  an  algorithm  which  permits 
us  to  calculate  the  state. 

The point of this is that specification of the
algorithm may require very little rapid-access storage.
On the other hand, time is required for the calculation.
Thus, as usual, we are trading time for rapid-access
storage. This idea has been used both in our previous
work in dynamic programming and in quasilinearization [8].
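
As a minimal sketch of this trade, let us suppose (hypothetically; none of these names appear in the text) that a switching history is stored as a leading sign together with a list of run lengths, and that the transition is the scalar rule g(x, y) = 0.5x + y. The state is then recomputed on demand rather than stored:

```python
def decode_controls(first_sign, run_lengths):
    """Expand (first_sign, [T1, T2, ...]) into the sequence of +1/-1 controls."""
    controls = []
    sign = first_sign
    for t in run_lengths:
        controls.extend([sign] * t)  # hold the current value for t stages
        sign = -sign                 # then switch
    return controls


def state_from_extended(c, first_sign, run_lengths, g):
    """Recompute the actual state by running the transition g from x_0 = c."""
    x = c
    for y in decode_controls(first_sign, run_lengths):
        x = g(x, y)
    return x


def g(x, y):
    # Hypothetical scalar transition, chosen only for illustration.
    return 0.5 * x + y


x = state_from_extended(0.0, +1, [2, 1], g)  # controls +1, +1, -1
```

Here a sign and two integers stand in for the state; the cost is one pass through the recurrence each time the state is needed.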

3.  DISCRETE  SWITCHING  PROCESS 

Consider  the  vector  difference  system 

(3.1)  $x_{n+1} = g(x_n, y_n), \quad x_0 = c,$

where $y_n$ is a control vector subject at each time to
the condition that it belong to a constraint set R,
$y_n \in R$. Since we are thinking in terms of a digital
computer calculation, there is no loss of generality in
beginning with a discrete process. Let it be required
to choose the $y_n$ in R so as to minimize

(3.2)  $\|x_N - z\|,$

where $\|\cdots\|$ denotes some measure of the deviation of
$x_N$ from z.
Writing

(3.3)  $f_N(c) = \min_{y_n \in R} \|x_N - z\|,$

we readily obtain the recurrence relation

(3.4)  $f_N(c) = \min_{y \in R} f_{N-1}(g(c, y)), \quad N \ge 1,$

       $f_0(c) = \|c - z\|.$

If the dimension of x is large, this is not computationally
feasible; see [9] for discussion.
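
To illustrate (3.4), the following sketch tabulates f_N(c) by memoization. It is written under assumptions not in the text: a scalar state, the hypothetical transition g(x, y) = 0.5x + y, the constraint set R = {-1, +1}, and the absolute value as the measure of deviation.

```python
from functools import lru_cache


def make_f(g, constraint_set, z):
    """Return f(n, c): least deviation from z after n stages starting at c."""

    @lru_cache(maxsize=None)
    def f(n, c):
        if n == 0:
            return abs(c - z)  # f_0(c) = ||c - z||
        # f_N(c) = min over y in R of f_{N-1}(g(c, y))
        return min(f(n - 1, g(c, y)) for y in constraint_set)

    return f


def g(x, y):
    # Hypothetical scalar transition, chosen only for illustration.
    return 0.5 * x + y


f = make_f(g, (-1.0, 1.0), z=1.0)
value = f(3, 0.0)  # 0.25 for this toy problem
```

The table is indexed by the state c itself. For a vector x of high dimension, it is this table that exceeds rapid-access storage.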

4.  ALTERNATIVE  FORMULATION 

Let us now consider the case where each $y_n$ has
only one nonzero component, say the first, which can
assume only the values $\pm 1$. Then a policy consists of
a choice of +1 for $T_1$ stages, -1 for $T_2$ stages,
and so on, or -1 for $T_1$ stages, +1 for $T_2$
stages, and so forth.

We  therefore  introduce  the  extended  states 

(4.1)  $[+, T_1, T_2, \ldots, T_k], \quad [-, T_1, T_2, \ldots, T_k]$

at time $n = T_1 + T_2 + \cdots + T_k > 0$. The first state
indicates that +1 has been used for $T_1$ stages, -1
for the next $T_2$ stages, and so on. We suppose that all
$T_i$ are positive. With the aid of the equation in (3.1),
we  can  now  calculate  the  actual  state  in  phase  space. 

Introduce  the  two  functions 

(4.2)  $f_N^{+}(T_1, T_2, \ldots, T_k)$ = distance from z at the end
       of N stages, starting in
       extended state $[+, T_1, T_2, \ldots, T_k]$,
       and using an optimal policy,

and $f_N^{-}(T_1, T_2, \ldots, T_k)$, defined similarly.

The  principle  of  optimality  now  yields,  in  the  usual 
fashion,  the  functional  equations 

(4.3)  $f_N^{+}(T_1, T_2, \ldots, T_k) = \min\left[ f_{N-1}^{+}(T_1, T_2, \ldots, T_k + 1),\; f_{N-1}^{+}(T_1, T_2, \ldots, T_k, 1) \right],$

for $N \ge 1$, with $f_0^{+}(T_1, T_2, \ldots, T_k)$ a
quantity calculable using (3.1), and

(4.4)  $f_N^{-}(T_1, T_2, \ldots, T_k) = \min\left[ f_{N-1}^{-}(T_1, T_2, \ldots, T_k + 1),\; f_{N-1}^{-}(T_1, T_2, \ldots, T_k, 1) \right],$

for $N \ge 1$, with $f_0^{-}(T_1, T_2, \ldots, T_k)$ calculable.
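
The functional equations above can be sketched as follows. This is an illustration under assumptions of our own, not the computation of the text: the state is scalar, g is the hypothetical rule g(x, y) = 0.5x + y, the deviation is the absolute value, the sign in the extended state labels the first block as in the description following (4.1), and the optional bound max_runs on the number of blocks is introduced here for convenience.

```python
from functools import lru_cache


def g(x, y):
    # Hypothetical scalar transition standing in for g in (3.1).
    return 0.5 * x + y


def actual_state(c, first_sign, runs):
    """Recover the state in phase space from the extended state via (3.1)."""
    x, sign = c, first_sign
    for t in runs:
        for _ in range(t):
            x = g(x, sign)
        sign = -sign  # the control alternates from block to block
    return x


def solve(c, z, total_stages, max_runs=None):
    """Least terminal deviation |x_N - z| over all alternating policies."""

    @lru_cache(maxsize=None)
    def f(remaining, first_sign, runs):
        if remaining == 0:
            # Terminal value: a quantity calculable using (3.1).
            return abs(actual_state(c, first_sign, runs) - z)
        # Continue the current control: T_k becomes T_k + 1.
        best = f(remaining - 1, first_sign, runs[:-1] + (runs[-1] + 1,))
        # Switch: open a new block of length 1.
        if max_runs is None or len(runs) < max_runs:
            best = min(best, f(remaining - 1, first_sign, runs + (1,)))
        return best

    # The first stage uses +1 or -1, leaving total_stages - 1 stages.
    return min(f(total_stages - 1, s, (1,)) for s in (+1, -1))


best = solve(c=0.0, z=1.0, total_stages=3)  # 0.25 for this toy problem
```

The arguments of f are a sign and a short tuple of integers, whatever the dimension of x; the price is a fresh pass through (3.1) each time a terminal value is required.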
The quantity $\min \|x_N - z\|$ is given by

$\min\left[ f_{N-1}^{+}(1),\; f_{N-1}^{-}(1) \right].$

5.  COMPUTATIONAL  ASPECT

If the dimension of x is large, and we restrict
the number of switchings and the duration of the process
suitably, it is easy to see that the formulation of
Sec. 4 requires considerably less rapid-access storage
than the usual formulation of Sec. 3.
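
A rough count makes the comparison concrete. The figures below rest on assumptions of our own: the usual formulation is taken to tabulate f on a grid with a fixed number of points per coordinate, and the extended states are counted over all times up to the duration of the process, using the fact that a time n can be split into k positive run lengths in C(n-1, k-1) ways, with two choices of leading sign.

```python
from math import comb


def count_extended_states(duration, max_runs):
    """Number of extended states with at most max_runs blocks, times 1..duration."""
    total = 0
    for n in range(1, duration + 1):
        for k in range(1, min(max_runs, n) + 1):
            # 2 leading signs, C(n-1, k-1) ways to write n = T_1 + ... + T_k
            total += 2 * comb(n - 1, k - 1)
    return total


def count_grid_points(points_per_axis, dimension):
    """Number of table entries for a grid over the state vector."""
    return points_per_axis ** dimension


# Hypothetical sizes: 20 stages, at most two switchings (three blocks),
# against a six-dimensional state with 100 grid points per coordinate.
extended = count_extended_states(duration=20, max_runs=3)
grid = count_grid_points(points_per_axis=100, dimension=6)
```

With these illustrative sizes the extended states number 2700, against 10^12 grid points; the count of extended states does not depend on the dimension of x.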

Let us also point out that in the case where $y_n$ has a
number of nonzero components, we can use the foregoing
procedure as a method of successive approximations; see
the discussion of the general Hitchcock-Koopmans-Kantorovich
scheduling problem in [9].

6.  PRODUCTION  PROCESSES

In  a  number  of  production  and  experimentation 
processes,  the  equation  describing  the  system  takes  the 
form 

(6.1)  $x_{n+1} = A_n x_n, \quad x_0 = c,$

where the matrices $A_n$ are to be chosen subject to
constraints. These may be treated in the same way as
above.